Collaborative Perception From Data Association To Localization
During the last decade, visual sensors have become ubiquitous. One or more cameras
can be found in devices ranging from smartphones to unmanned aerial vehicles and
autonomous cars. Over the same period, we have witnessed the emergence of
large-scale networks, ranging from sensor networks to robotic swarms.
Assume multiple visual sensors perceive the same scene from different viewpoints. To
achieve consistent perception, the problem of correspondences between observed
features must first be solved. It is then often necessary to perform distributed
localization, i.e., to estimate the pose of each agent with respect to a global reference
frame. Once everything is expressed in the same coordinate system and carries the
same meaning for all agents, coordinated operation of the agents and interpretation
of the jointly observed scene become possible.
The questions we address in this thesis are the following: first, can a group of visual
sensors agree on what they see, in a decentralized fashion? This is the problem of
collaborative data association. Then, based on what they see, can the visual sensors
agree on where they are, in a decentralized fashion as well? This is the problem of
cooperative localization.
The contributions of this work are five-fold. First, we are the first to address the
problem of consistent multiway matching in a decentralized setting. Second, we
propose an efficient decentralized dynamical-systems approach for computing any
number of the smallest eigenvalues, and the associated eigenvectors, of a weighted
graph, with global convergence guarantees and direct applications to group
synchronization problems, e.g., permutation or rotation synchronization. Third, we
propose a state-of-the-art framework for decentralized collaborative localization of
mobile agents in the presence of unknown cross-correlations, solving a minimax
optimization problem to account for the missing information. Fourth, we are the
first to present an approach to the 3-D rotation localization of a camera sensor
network from relative bearing measurements. Lastly, we focus on the case of a group
of three visual sensors. We propose a novel Riemannian geometric representation of
the trifocal tensor, which relates projections of points and lines in three overlapping
views. This representation enables the use of state-of-the-art optimization methods
on Riemannian manifolds and of robust averaging techniques for estimating the
trifocal tensor.
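The second contribution targets the smallest eigenvalues and eigenvectors of a weighted graph. As a point of reference for the quantity the decentralized dynamical-systems method computes, here is a minimal centralized sketch in numpy (the thesis's method is decentralized and iterative; this only shows the target spectrum on an illustrative graph):

```python
import numpy as np

def smallest_laplacian_eigs(W, k):
    """Return the k smallest eigenvalues and eigenvectors of the
    Laplacian L = D - W of a weighted undirected graph."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    vals, vecs = np.linalg.eigh(L)  # eigh returns eigenvalues in ascending order
    return vals[:k], vecs[:, :k]

# Weighted path graph on 3 nodes: 0 --1.0-- 1 --2.0-- 2
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 2.0],
              [0.0, 2.0, 0.0]])
vals, vecs = smallest_laplacian_eigs(W, 2)
# For a connected graph, the smallest Laplacian eigenvalue is 0,
# with the constant vector as its eigenvector.
```

In synchronization applications, the relevant eigenvectors belong to a similarly structured connection or data matrix rather than the plain Laplacian; the point here is only the spectral quantity being computed.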
3D Shape Estimation from 2D Landmarks: A Convex Relaxation Approach
We investigate the problem of estimating the 3D shape of an object, given a
set of 2D landmarks in a single image. To alleviate the reconstruction
ambiguity, a widely-used approach is to confine the unknown 3D shape within a
shape space built upon existing shapes. While this approach has proven to be
successful in various applications, a challenging issue remains, i.e., the
joint estimation of shape parameters and camera-pose parameters requires
solving a nonconvex optimization problem. Existing methods often adopt an
alternating minimization scheme to locally update the parameters, and
consequently the solution is sensitive to initialization. In this paper, we
propose a convex formulation to address this problem and develop an efficient
algorithm to solve the proposed convex program. We demonstrate the exact
recovery property of the proposed method, its merits compared to alternative
methods, and its applicability to human pose and car shape estimation.
Comment: In Proceedings of CVPR 201
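The shape-space model can be made concrete: under a weak-perspective camera, the 2D landmarks are a projection of a linear combination of basis shapes. With the camera fixed, the shape coefficients reduce to linear least squares; the joint problem over shape and camera is the nonconvex one the paper's convex relaxation targets. A minimal sketch, with all names and dimensions illustrative rather than the paper's notation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two basis shapes (3 x p each); the unknown 3D shape is a linear combination.
p = 8
B1, B2 = rng.standard_normal((3, p)), rng.standard_normal((3, p))
c_true = np.array([0.7, -1.3])
S = c_true[0] * B1 + c_true[1] * B2           # true 3D shape

# Weak-perspective camera: the first two rows of a rotation matrix.
R, _ = np.linalg.qr(rng.standard_normal((3, 3)))
M = R[:2, :]                                   # 2 x 3 projection
W = M @ S                                      # observed 2D landmarks (2 x p)

# With M known, the coefficients follow from linear least squares.
# Jointly optimizing over (c, M) is nonconvex.
A = np.stack([(M @ B1).ravel(), (M @ B2).ravel()], axis=1)
c_hat, *_ = np.linalg.lstsq(A, W.ravel(), rcond=None)
```

In the noiseless case the linear sub-problem recovers the coefficients exactly, which is what makes alternating schemes tempting despite their sensitivity to initialization.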
Image Segmentation by Medial Axis Decomposition
In the framework of this thesis, we present new
image segmentation techniques based on a weighted
medial axis decomposition procedure. Starting from an
image contour map, we compute a weighted distance map
and its weighted medial axis. Applying the same distance
propagation from the medial axis backwards, we dually obtain
an initial image partition and a graph representing image
structure. Using a disjoint-set data structure, we then merge
adjacent regions according to some criteria. Several different
criteria were examined and tested. First, we use medial axis
saddle point height to express similarity between adjacent regions
and merge correspondingly. A second distinct direction we follow is
to merge adjacent regions according to how fragmented they are. Last
but not least, we use ultrametric contour map representation to implement
hierarchical segmentation. As inter-region ultrametric dissimilarities,
we use mean boundary strength on the common boundary between adjacent
regions and inter-region fragmentation. All the above mentioned techniques
are evaluated using the Berkeley Segmentation Dataset and compared with
several state-of-the-art algorithms. Without learning, we achieve performance
near the state of the art with very practical running times.
Σπυρίδων Μ. Λεονάρδο
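The region-merging stage described above can be sketched with a disjoint-set forest. In this toy example the regions and boundary strengths are made up, and the merge criterion is reduced to a single threshold on boundary strength rather than the saddle-point, fragmentation, or ultrametric criteria used in the thesis:

```python
class DisjointSet:
    """Disjoint-set forest with path compression and union by size."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return ra

# Adjacent-region pairs with the strength of their shared boundary.
edges = [(0, 1, 0.05), (1, 2, 0.90), (2, 3, 0.10), (3, 4, 0.80)]

# Greedily merge across weak boundaries, in order of increasing strength.
ds = DisjointSet(5)
for a, b, strength in sorted(edges, key=lambda e: e[2]):
    if strength < 0.5:          # merge only where the boundary is weak
        ds.union(a, b)

labels = [ds.find(i) for i in range(5)]
# Regions {0, 1} merge, {2, 3} merge, and 4 stays separate.
```

Processing boundaries in order of increasing strength is also what makes the ultrametric, hierarchical variant natural: each merge step corresponds to a level of the segmentation hierarchy.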